Smart Paper Notes

Optimization-Based Mechanical Perception for Peduncle Localization During Robotic Fruit Harvest

Miranda Cravetz, Cindy Grimm, Joseph R. Davidson

Category: Robotics

2022-09-27

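Rising global food demand and harsh working conditions make fruit harvesting an important domain for automation. Peduncle localization is an important step for any automated fruit harvesting system, because fruit separation techniques are highly sensitive to peduncle location. Most work on peduncle localization has focused on computer vision, but peduncles can be difficult to access visually because of the cluttered nature of agricultural environments. Our work proposes an alternative method that relies on mechanical, rather than visual, perception to localize the peduncle. To estimate the location of this important plant feature, we fit wrench measurements from a wrist force/torque sensor to a physical model of the fruit-plant system, treating the fruit's attachment point as a parameter to be tuned. The method is performed in-line as part of the fruit picking procedure. Using our orchard proxy for evaluation, we demonstrate that the technique can localize the peduncle within a median distance of 3.8 cm and with a median orientation error of 16.8 degrees.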

Translated by Google Translate
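
The idea of fitting wrench measurements to a physical model with the attachment point as a free parameter can be illustrated with a deliberately simplified sketch. It assumes, purely for illustration, that the stem transmits a pure force F at an unknown point p in the sensor frame, so the measured torque is tau = p x F; this is not the authors' model of the fruit-plant system. The function names and the sample values are hypothetical.

```python
def skew(v):
    """Return the matrix S such that S @ w equals the cross product v x w."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def solve3(m, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [r] for row, r in zip(m, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("forces are (nearly) parallel; point not observable")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def estimate_attachment_point(forces, torques):
    """Least-squares estimate of p from samples satisfying tau = p x F.

    Assumed toy model: p x F = -skew(F) p, which is linear in p, so the
    normal equations (sum A^T A) p = sum A^T tau can be solved directly.
    """
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for f, tau in zip(forces, torques):
        a = [[-x for x in row] for row in skew(f)]  # A = -skew(F)
        for i in range(3):
            for j in range(3):
                ata[i][j] += sum(a[k][i] * a[k][j] for k in range(3))
            atb[i] += sum(a[k][i] * tau[k] for k in range(3))
    return solve3(ata, atb)


# Synthetic, noise-free example (values are made up for illustration).
true_point = [0.02, -0.01, 0.15]
sample_forces = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.5, 0.5, -1.0]]
sample_torques = [cross(true_point, f) for f in sample_forces]
estimated_point = estimate_attachment_point(sample_forces, sample_torques)
```

At least two non-parallel force directions are needed in this toy model, because the component of p along a single force direction produces no torque and cannot be recovered.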

Multi hash embeddings in spaCy

Lester James Miranda, Ákos Kádár, Adriane Boyd, Sofie Van Landeghem, Anders Søgaard, Matthew Honnibal

Category: Natural Language Processing

2022-12-19

The distributed representation of symbols is one of the key technologies in machine learning systems today, playing a pivotal role in modern natural language processing. Traditional word embeddings associate a separate vector with each word. While this approach is simple and leads to good performance, it requires a lot of memory for representing a large vocabulary. To reduce the memory footprint, the default embedding layer in spaCy is a hash embeddings layer. It is a stochastic approximation of traditional embeddings that provides unique vectors for a large number of words without explicitly storing a separate vector for each of them. To be able to compute meaningful representations for both known and unknown words, hash embeddings represent each word as a summary of the normalized word form, subword information and word shape. Together, these features produce a multi-embedding of a word. In this technical report we first lay out a bit of history and introduce the embedding methods in spaCy in detail. Second, we critically evaluate the hash embedding architecture with multi-embeddings on Named Entity Recognition datasets from a variety of domains and languages. The experiments validate most key design choices behind spaCy's embedders, but we also uncover a few surprising results.
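
The hashing idea described in this abstract can be sketched in a few lines of plain Python. This is a minimal illustration and does not reproduce spaCy's implementation: the table sizes, the vector width, the use of four hash seeds, the BLAKE2 hash, the one-character prefix, the three-character suffix, and all class and function names are assumptions made for the example. Each feature of a word is hashed several times into a small table, the selected rows are summed, and the per-feature vectors are concatenated.

```python
import hashlib
import random


def word_shape(word):
    """Crude word shape: uppercase -> X, lowercase -> x, digit -> d."""
    out = []
    for ch in word:
        if ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        elif ch.isdigit():
            out.append("d")
        else:
            out.append(ch)
    return "".join(out)


def word_features(word):
    """Normalized form, subword pieces and shape (illustrative choices)."""
    return {
        "norm": word.lower(),
        "prefix": word[:1],
        "suffix": word[-3:],
        "shape": word_shape(word),
    }


def hash_rows(value, seed_count, n_rows):
    """Map a string to seed_count table rows using differently salted hashes."""
    rows = []
    for seed in range(seed_count):
        digest = hashlib.blake2b(
            value.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
        ).digest()
        rows.append(int.from_bytes(digest, "little") % n_rows)
    return rows


class HashEmbedSketch:
    """A fixed-size table; a value's vector is the sum of its hashed rows."""

    def __init__(self, n_rows, width, rng, seed_count=4):
        self.n_rows, self.width, self.seed_count = n_rows, width, seed_count
        self.table = [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                      for _ in range(n_rows)]

    def __call__(self, value):
        vec = [0.0] * self.width
        for row in hash_rows(value, self.seed_count, self.n_rows):
            for i, x in enumerate(self.table[row]):
                vec[i] += x
        return vec


class MultiHashEmbedSketch:
    """Concatenate one hash embedding per word feature."""

    def __init__(self, rows_per_feature, width):
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
        self.embedders = {name: HashEmbedSketch(n, width, rng)
                          for name, n in sorted(rows_per_feature.items())}

    def __call__(self, word):
        feats = word_features(word)
        out = []
        for name in sorted(self.embedders):
            out.extend(self.embedders[name](feats[name]))
        return out


embed = MultiHashEmbedSketch(
    {"norm": 50, "prefix": 10, "suffix": 20, "shape": 10}, width=8
)
vector = embed("Apple")
```

Because rows are chosen by hashing, an unseen word still receives a vector, and two words collide completely only if all of their hashed rows coincide, which is why several small tables can stand in for one very large vocabulary table. Any mixing layer applied after concatenation is omitted here.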

Discovering Language Model Behaviors with Model-Written Evaluations

Ethan Perez, Sam Ringer, Kamilė Lukošiūtė, Karina Nguyen, Edwin Chen, Scott Heiner, Craig Pettit, Catherine Olsson, Sandipan Kundu, Saurav Kadavath

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-19

As language models (LMs) scale, they develop many novel behaviors, good and bad, exacerbating the need to evaluate how they behave. Prior work creates evaluations with crowdwork (which is time-consuming and expensive) or existing data sources (which are not always available). Here, we automatically generate evaluations with LMs. We explore approaches with varying amounts of human effort, from instructing LMs to write yes/no questions to making complex Winogender schemas with multiple stages of LM-based generation and filtering. Crowdworkers rate the examples as highly relevant and agree with 90-100% of labels, sometimes more so than corresponding human-written datasets. We generate 154 datasets and discover new cases of inverse scaling where LMs get worse with size. Larger LMs repeat back a dialog user's preferred answer ("sycophancy") and express greater desire to pursue concerning goals like resource acquisition and goal preservation. We also find some of the first examples of inverse scaling in RL from Human Feedback (RLHF), where more RLHF makes LMs worse. For example, RLHF makes LMs express stronger political views (on gun rights and immigration) and a greater desire to avoid shut down. Overall, LM-written evaluations are high-quality and let us quickly discover many novel LM behaviors.

Speech Aware Dialog System Technology Challenge (DSTC11)

Hagen Soltau, Izhak Shafran, Mingqiu Wang, Abhinav Rastogi, Jeffrey Zhao, Ye Jia, Wei Han, Yuan Cao, Aramys Miranda

Category: Artificial Intelligence

2022-12-16

Most research on task-oriented dialog modeling is based on written text input. However, users often interact with practical dialog systems using speech as input. Typically, systems convert speech into text using an Automatic Speech Recognition (ASR) system, introducing errors. Furthermore, these systems do not address the differences between written and spoken language. Research on this topic is stymied by the lack of a public corpus. Motivated by these considerations, our goal in hosting the speech-aware dialog state tracking challenge was to create a public corpus or task that can be used to investigate the performance gap between the written and spoken forms of input, develop models that could alleviate this gap, and establish whether Text-to-Speech-based (TTS) systems are a reasonable surrogate for the more labor-intensive human data collection. We created three spoken versions of the popular written-domain MultiWoz task -- (a) TTS-Verbatim: written user inputs were converted into speech waveforms using a TTS system, (b) Human-Verbatim: humans spoke the user inputs verbatim, and (c) Human-paraphrased: humans paraphrased the user inputs. Additionally, we provided different forms of ASR output to encourage wider participation from teams that may not have access to state-of-the-art ASR systems. These included ASR transcripts, word time stamps, and latent representations of the audio (audio encoder outputs). In this paper, we describe the corpus, report results from participating teams, provide preliminary analyses of their results, and summarize the current state-of-the-art in this domain.

Casual Conversations v2: Designing a large consent-driven dataset to measure algorithmic bias and robustness

Caner Hazirbas, Yejin Bang, Tiezheng Yu, Parisa Assar, Bilal Porgali, Vítor Albiero, Stefan Hermanek, Jacqueline Pan, Emily McReynolds, Miranda Bogen

Category: Computer Vision | Artificial Intelligence | Natural Language Processing

2022-11-10

Developing robust and fair AI systems requires datasets with a comprehensive set of labels that can help ensure the validity and legitimacy of relevant measurements. Recent efforts, therefore, focus on collecting person-related datasets that have carefully selected labels, including sensitive characteristics, and consent forms in place to use those attributes for model testing and development. Responsible data collection involves several stages, including but not limited to determining use-case scenarios, selecting categories (annotations) such that the data are fit for the purpose of measuring algorithmic bias for subgroups and, most importantly, ensuring that the selected categories/subcategories are robust to regional diversities and inclusive of as many subgroups as possible. Meta, in a continuation of our efforts to measure AI algorithmic bias and robustness (https://ai.facebook.com/blog/shedding-light-on-fairness-in-ai-with-a-new-data-set), is working on collecting a large consent-driven dataset with a comprehensive list of categories. This paper describes our proposed design of such categories and subcategories for Casual Conversations v2.

On the Generalization of Deep Reinforcement Learning Methods in the Problem of Local Navigation

Victor R. F. Miranda, Armando A. Neto, Gustavo M. Freitas, Leonardo A. Mozelli

Category: Robotics | Machine Learning

2022-09-28

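In this paper, we study the application of DRL algorithms to local navigation problems, in which a robot equipped only with limited-range exteroceptive sensors (such as LiDAR) moves towards a goal location in unknown and cluttered workspaces. Collision avoidance policies based on DRL have some advantages, but they are quite susceptible to local minima, since their capacity to learn suitable actions is limited to the sensor range. Because most robots perform tasks in unstructured environments, it is of great interest to seek generalized local navigation policies capable of avoiding local minima, especially in untrained scenarios. To this end, we propose a novel reward function that incorporates map information gained in the training stage, increasing the agent's capacity to deliberate about the best course of action. In addition, we use the SAC algorithm to train our ANN, which proves to be more effective than other algorithms in the state-of-the-art literature. A set of sim-to-sim and sim-to-real experiments shows that our proposed reward combined with SAC outperforms the compared methods in terms of local minima and collision avoidance.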

The Folded Pneumatic Artificial Muscle (foldPAM): Towards Programmability and Control via End Geometry

Sicheng Wang, Eugenio Frias Miranda, Laura H. Blumenschein

Category: Robotics

2022-09-03

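Soft pneumatic actuators have seen applications in many soft robotic systems, and their pressure-driven nature presents unique challenges and opportunities for controlling their motion. In this work, we present a new concept: designing and controlling pneumatic actuators via end geometry. We demonstrate a novel actuator class, named the folded Pneumatic Artificial Muscle (foldPAM), which features a thin-film air pouch that is symmetrically folded on each side. Varying the folded portion of the actuator changes the end constraints and, hence, the force-strain relationship. We investigate this change experimentally by measuring the force-strain relationship of individual foldPAM units with various lengths and amounts of folding. In addition to static-geometry units, an actuated foldPAM device was designed to produce continuous, on-demand adjustment of the end geometry, enabling closed-loop position control while maintaining constant pressure. Experiments with the device indicate that geometry control allows access to different regions of the force-strain plane and that closed-loop geometry control can achieve errors within 0.5% of the actuation range.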

Nonlinear desirability theory

Enrique Miranda, Marco Zaffalon

Category: Artificial Intelligence

2022-09-01

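Desirability can be understood as an extension of Anscombe and Aumann's Bayesian decision theory to sets of expected utilities. At the core of desirability lies an assumption of linearity of the scale in which rewards are measured. It is a traditional assumption used to derive the expected utility model, yet it clashes with a general representation of rational decision making. Allais, in particular, pointed this out in 1953 with his famous paradox. We note that the utility scale plays the role of a closure operator when we regard desirability as a logical theory. This observation enables us to extend desirability to the nonlinear case by letting the utility scale be represented via a general closure operator. The new theory directly expresses rewards in actual nonlinear currency (money), much in Savage's spirit, while arguably weakening the founding assumptions to a minimum. We characterise the main properties of the new theory both from the perspective of sets of gambles and of their lower and upper prices (previsions). We show how the Allais paradox finds a solution in the new theory, and discuss the role of sets of probabilities in the theory.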

SNGuess: A method for the selection of young extragalactic transients

N. Miranda, J. C. Freytag, J. Nordin, R. Biswas, V. Brinnel, C. Fremling, M. Kowalski, A. Mahabal, S. Reusch, J. van Santen

Category: Machine Learning

2022-08-13

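With the rapidly rising number of transients detected in astronomy, classification methods based on machine learning are increasingly being employed. Their goal is typically to obtain a definitive classification of transients, and for good performance they usually require a large set of observations. However, well-designed, targeted models can reach their classification goals with fewer computing resources. This paper presents SNGuess, a model designed to find young, nearby extragalactic transients with high purity. SNGuess works with a set of features that can be efficiently calculated from astronomical alert data. Some of these features are static and associated with the alert metadata, while others must be calculated from the photometric observations contained in the alert. Most of the features are simple enough to be obtained or calculated at the early stages of a transient's lifetime after its detection. We calculate these features for a set of labeled public alert data from the Zwicky Transient Facility (ZTF). The core model of SNGuess consists of an ensemble of decision trees trained via gradient boosting. Approximately 88% of the candidates suggested by SNGuess from a set of ZTF alerts spanning April 2020 to August 2021 were found to be true relevant supernovae (SNe). For alerts with bright detections, this number ranges between 92% and 98%. Since April 2020, transients identified by SNGuess as potential SNe in the ZTF alert stream have been published to the Transient Name Server (TNS) under the AMPEL_ZTF_NEW group identifier. SNGuess scores for any transient observed by ZTF can be accessed via a web service. The source code of SNGuess is publicly available.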
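
The phrase "an ensemble of decision trees trained via gradient boosting" can be made concrete with a small, self-contained sketch. It is not the SNGuess code: the trees are reduced to one-split stumps, the two features (days since first detection and a brightness rise rate) are hypothetical stand-ins for alert features, and the eight training rows are synthetic. Each round fits a stump to the residual between the labels and the current predicted probabilities, which is the negative gradient of the log-loss.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(rows, residuals):
    """Find the single split that best fits the residuals (squared error)."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(rows, residuals) if row[j] <= thr]
            right = [r for row, r in zip(rows, residuals) if row[j] > thr]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            err = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, thr, left_mean, right_mean)
    if best is None:
        raise ValueError("no feature has two distinct values to split on")
    return best[1:]


def stump_value(stump, row):
    j, thr, left_mean, right_mean = stump
    return left_mean if row[j] <= thr else right_mean


def train(rows, labels, n_rounds=20, learning_rate=0.5):
    """Gradient boosting on log-loss with decision stumps as weak learners."""
    positive_rate = min(max(sum(labels) / len(labels), 1e-6), 1 - 1e-6)
    base = math.log(positive_rate / (1 - positive_rate))
    scores = [base] * len(rows)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the log-loss with respect to the score.
        residuals = [y - sigmoid(s) for y, s in zip(labels, scores)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_value(stump, row)
                  for s, row in zip(scores, rows)]
    return base, learning_rate, stumps


def predict_proba(model, row):
    base, learning_rate, stumps = model
    score = base + sum(learning_rate * stump_value(s, row) for s in stumps)
    return sigmoid(score)


# Synthetic rows: [days_since_first_detection, rise_rate]; label 1 = candidate.
toy_rows = [[1, 0.30], [2, 0.25], [3, 0.20], [2, 0.40],
            [10, 0.02], [15, -0.05], [8, 0.01], [20, -0.10]]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_model = train(toy_rows, toy_labels)
young_score = predict_proba(toy_model, [2, 0.35])
old_score = predict_proba(toy_model, [12, 0.0])
```

A production system would normally use an established gradient boosting library with deeper trees and regularization; the stump version is used here only to keep the mechanics visible.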

The Curse of Low Task Diversity: On the Failure of Transfer Learning to Outperform MAML and Their Empirical Equivalence

Brando Miranda, Patrick Yu, Yu-Xiong Wang, Sanmi Koyejo

Category: Machine Learning

2022-08-02

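Recently, it has been observed that a transfer learning solution might be all we need to solve many few-shot learning benchmarks, which raises important questions about when and how meta-learning algorithms should be deployed. In this paper, we seek to clarify these questions by (1) proposing a novel metric, the diversity coefficient, to measure the diversity of tasks in a few-shot learning benchmark, and (2) comparing MAML and transfer learning under fair conditions (same architecture, same optimizer, and all models trained to convergence). Using the diversity coefficient, we show that the popular MiniImageNet and CIFAR-FS few-shot learning benchmarks have low diversity. This novel insight contextualizes claims that transfer learning solutions are better than meta-learned solutions in the low-diversity regime under a fair comparison. Specifically, we find empirically that a low diversity coefficient correlates with a high similarity between transfer learning and MAML-learned solutions, both in accuracy at meta-test time and in classification layer similarity (using feature-based distance metrics such as SVCCA, PWCCA, CKA, and OPD). To further support our claim, we find that this meta-test accuracy result holds even as the model size changes. We therefore conclude that, in the low-diversity regime, MAML and transfer learning have equivalent meta-test performance when compared fairly. We also hope our work inspires more thoughtful construction and quantitative evaluation of meta-learning benchmarks.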
